explanatory feedback
LLM-Generated Feedback Supports Learning If Learners Choose to Use It
Thomas, Danielle R., Borchers, Conrad, Bhushan, Shambhavi, Gatz, Erin, Gupta, Shivang, Koedinger, Kenneth R.
Large language models (LLMs) are increasingly used to generate feedback, yet their impact on learning remains underexplored, especially compared to existing feedback methods. This study investigates how on-demand LLM-generated explanatory feedback influences learning in seven scenario-based tutor training lessons. Analyzing over 2,600 lesson completions from 885 tutor learners, we compare posttest performance among learners across three groups: learners who received feedback generated by gpt-3.5-turbo, those who declined it, and those without access. All groups received non-LLM corrective feedback. To address potential selection bias-where higher-performing learners may be more inclined to use LLM feedback-we applied propensity scoring. Learners with a higher predicted likelihood of engaging with LLM feedback scored significantly higher at posttest than those with lower propensity. After adjusting for this effect, two out of seven lessons showed statistically significant learning benefits from LLM feedback with standardized effect sizes of 0.28 and 0.33. These moderate effects suggest that the effectiveness of LLM feedback depends on the learners' tendency to seek support. Importantly, LLM feedback did not significantly increase completion time, and learners overwhelmingly rated it as helpful. These findings highlight LLM feedback's potential as a low-cost and scalable way to improve learning on open-ended tasks, particularly in existing systems already providing feedback without LLMs. This work contributes open datasets, LLM prompts, and rubrics to support reproducibility.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Instructional Material (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting > Online (1.00)
Demonstration Based Explainable AI for Learning from Demonstration Methods
Gu, Morris, Croft, Elizabeth, Kulic, Dana
Abstract--Learning from Demonstration (LfD) is a powerful type of machine learning that can allow novices to teach and program robots to complete various tasks. However, the learning process for these systems may still be difficult for novices to interpret and understand, making effective teaching challenging. Explainable artificial intelligence (XAI) aims to address this challenge by explaining a system to the user. In this work, we investigate XAI within LfD by implementing an adaptive explanatory feedback system on an inverse reinforcement learning (IRL) algorithm. The feedback is implemented by demonstrating selected learnt trajectories to users. The system adapts to user teaching by categorizing and then selectively sampling trajectories shown to a user, to show a representative sample of both successful and unsuccessful trajectories. The system was evaluated through a user study with 26 participants teaching a robot a navigation task. The results of the user study demonstrated that the proposed explanatory feedback system can improve robot performance, teaching efficiency and user understanding of the robot.
- Oceania > Australia (0.04)
- Africa > Mozambique > Gaza Province > Xai-Xai (0.04)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.94)
- Research Report > New Finding (0.86)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.85)
How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses
Lin, Jionghao, Han, Zifei, Thomas, Danielle R., Gurung, Ashish, Gupta, Shivang, Aleven, Vincent, Koedinger, Kenneth R.
One-on-one tutoring is widely acknowledged as an effective instructional method, conditioned on qualified tutors. However, the high demand for qualified tutors remains a challenge, often necessitating the training of novice tutors (i.e., trainees) to ensure effective tutoring. Research suggests that providing timely explanatory feedback can facilitate the training process for trainees. However, it presents challenges due to the time-consuming nature of assessing trainee performance by human experts. Inspired by the recent advancements of large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. This system identifies trainees' responses in binary form (i.e., correct/incorrect) and automatically provides template-based feedback with responses appropriately rephrased by the GPT-4 model. We conducted our study on 410 responses from trainees across three training lessons: Giving Effective Praise, Reacting to Errors, and Determining What Students Know. Our findings indicate that: 1) using a few-shot approach, the GPT-4 model effectively identifies correct/incorrect trainees' responses from three training lessons with an average F1 score of 0.84 and an AUC score of 0.85; and 2) using the few-shot approach, the GPT-4 model adeptly rephrases incorrect trainees' responses into desired responses, achieving performance comparable to that of human experts.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Europe > Switzerland (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Instructional Material > Course Syllabus & Notes (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting (1.00)
How Can I Improve? Using GPT to Highlight the Desired and Undesired Parts of Open-ended Responses
Lin, Jionghao, Chen, Eason, Han, Zeifei, Gurung, Ashish, Thomas, Danielle R., Tan, Wei, Nguyen, Ngoc Dang, Koedinger, Kenneth R.
Automated explanatory feedback systems play a crucial role in facilitating learning for a large cohort of learners by offering feedback that incorporates explanations, significantly enhancing the learning process. However, delivering such explanatory feedback in real-time poses challenges, particularly when high classification accuracy for domain-specific, nuanced responses is essential. Our study leverages the capabilities of large language models, specifically Generative Pre-Trained Transformers (GPT), to explore a sequence labeling approach focused on identifying components of desired and less desired praise for providing explanatory feedback within a tutor training dataset. Our aim is to equip tutors with actionable, explanatory feedback during online training lessons. To investigate the potential of GPT models for providing the explanatory feedback, we employed two commonly-used approaches: prompting and fine-tuning. To quantify the quality of highlighted praise components identified by GPT models, we introduced a Modified Intersection over Union (M-IoU) score. Our findings demonstrate that: (1) the M-IoU score effectively correlates with human judgment in evaluating sequence quality; (2) using two-shot prompting on GPT-3.5 resulted in decent performance in recognizing effort-based (M-IoU of 0.46) and outcome-based praise (M-IoU of 0.68); and (3) our optimally fine-tuned GPT-3.5 model achieved M-IoU scores of 0.64 for effort-based praise and 0.84 for outcome-based praise, aligning with the satisfaction levels evaluated by human coders. Our results show promise for using GPT models to provide feedback that focuses on specific elements in their open-ended responses that are desirable or could use improvement.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Instructional Material (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting > Online (1.00)
Using Large Language Models to Provide Explanatory Feedback to Human Tutors
Lin, Jionghao, Thomas, Danielle R., Han, Feifei, Gupta, Shivang, Tan, Wei, Nguyen, Ngoc Dang, Koedinger, Kenneth R.
Research demonstrates learners engaging in the process of producing explanations to support their reasoning, can have a positive impact on learning. However, providing learners real-time explanatory feedback often presents challenges related to classification accuracy, particularly in domain-specific environments, containing situationally complex and nuanced responses. We present two approaches for supplying tutors real-time feedback within an online lesson on how to give students effective praise. This work-in-progress demonstrates considerable accuracy in binary classification for corrective feedback of effective, or effort-based (F1 score = 0.811), and ineffective, or outcome-based (F1 score = 0.350), praise responses. More notably, we introduce progress towards an enhanced approach of providing explanatory feedback using large language model-facilitated named entity recognition, which can provide tutors feedback, not only while engaging in lessons, but can potentially suggest real-time tutor moves. Future work involves leveraging large language models for data augmentation to improve accuracy, while also developing an explanatory feedback interface.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Oceania > Australia (0.04)
- Education > Educational Technology > Educational Software > Computer Based Training (0.65)
- Education > Educational Setting > Online (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.90)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.77)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.73)